An Environment Model for Nonstationary Reinforcement Learning
Authors
Abstract
Reinforcement learning in nonstationary environments is generally regarded as an important yet difficult problem. This paper partially addresses the problem by formalizing a subclass of nonstationary environments. The environment model, called hidden-mode Markov decision process (HM-MDP), assumes that environmental changes are always confined to a small number of hidden modes. A mode basically indexes a Markov decision process (MDP) and evolves with time according to a Markov chain. While HM-MDP is a special case of partially observable Markov decision processes (POMDPs), modeling an HM-MDP environment via the more general POMDP model unnecessarily increases the problem complexity. A variant of the Baum-Welch algorithm is developed for model learning; it requires less data and time.
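The mode structure described in the abstract admits a compact tabular representation: each mode carries its own MDP transition and reward tables, while the mode itself evolves according to an action-independent Markov chain that is hidden from the agent. The following Python sketch simulates such an environment under those assumptions; class and variable names such as HiddenModeMDP and mode_T are illustrative and not taken from the paper.

```python
# Minimal sketch of an HM-MDP environment simulator (assumed, illustrative API).
import numpy as np

class HiddenModeMDP:
    """Environment with M hidden modes, each a tabular MDP over S states and A actions.

    mode_T : (M, M)       Markov chain over modes (row-stochastic).
    trans  : (M, S, A, S) per-mode state-transition probabilities.
    reward : (M, S, A)    per-mode expected rewards.
    The agent observes only the state and reward; the mode stays hidden.
    """

    def __init__(self, mode_T, trans, reward, rng=None):
        self.mode_T = np.asarray(mode_T)
        self.trans = np.asarray(trans)
        self.reward = np.asarray(reward)
        self.rng = rng or np.random.default_rng()
        self.n_modes, self.n_states = self.trans.shape[0], self.trans.shape[1]
        self.mode = self.rng.integers(self.n_modes)    # hidden mode
        self.state = self.rng.integers(self.n_states)  # observed state

    def step(self, action):
        r = self.reward[self.mode, self.state, action]
        # The state evolves under the current mode's MDP dynamics.
        self.state = self.rng.choice(
            self.n_states, p=self.trans[self.mode, self.state, action])
        # The mode evolves according to its own Markov chain, independent of the action.
        self.mode = self.rng.choice(self.n_modes, p=self.mode_T[self.mode])
        return self.state, r
```

Stepping this environment returns only the next state and reward, so a learner such as the paper's Baum-Welch variant must infer the hidden mode from that observation stream alone.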
Similar References
Bayesian Models of Nonstationary Markov Decision Processes
Standard reinforcement learning algorithms generate policies that optimize expected future rewards in a priori unknown domains, but they assume that the domain does not change over time. Prior work cast the reinforcement learning problem as a Bayesian estimation problem, using experience data to condition a probability distribution over domains. In this paper we propose an elaboration of the typ...
Multi-model Approach to Non-stationary Reinforcement Learning
This paper proposes a novel algorithm for a class of nonstationary reinforcement learning problems in which the environmental changes are rare and finite. Through discarding corrupted models and combining similar ones, the proposed algorithm maintains a collection of frequently encountered environment models and enables an effective adaptation when a similar environment recurs. The algorithm ha...
Multiple Model-Based Reinforcement Learning
We propose a modular reinforcement learning architecture for nonlinear, nonstationary control tasks, which we call multiple model-based reinforcement learning (MMRL). The basic idea is to decompose a complex task into multiple domains in space and time based on the predictability of the environmental dynamics. The system is composed of multiple modules, each of which consists of a state predict...
Reinforcement Learning in Nonstationary Environment Navigation Tasks
The field of reinforcement learning (RL) has made great strides in learning control knowledge from closed-loop interaction with environments. "Classical" RL, based on atomic state space representations, suffers from an inability to adapt to nonstationarities in the target Markov decision process (i.e., environment). Relational RL is widely seen as a potential solution to this shortcom...
Hidden-Mode Markov Decision Processes
Samuel P. M. Choi, Dit-Yan Yeung, Nevin L. Zhang; Department of Computer Science, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong. Abstract: Traditional reinforcement learning (RL) assumes that environment dynamics do not change over time (i.e., are stationary). This assumption, however, is not realistic in many real-...